Multilayer Perceptrons struggle to learn certain simple arithmetic tasks. Specialist neural modules for arithmetic can outperform classical architectures with gains in extrapolation, interpretability and convergence speeds, but are highly sensitive to the training range. In this paper, we show that Neural Multiplication Units (NMUs) are unable to reliably learn tasks as simple as multiplying two inputs when given different training ranges. Causes of failure are linked to inductive and input biases which encourage convergence to solutions in undesirable optima. A solution, the stochastic NMU (sNMU), is proposed to apply reversible stochasticity, encouraging avoidance of such optima whilst converging to the true solution. Empirically, we show that stochasticity provides improved robustness with the potential to improve learned representations of upstream networks for numerical and image tasks.
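The abstract does not give formulas, so the following is only a minimal sketch of the idea of "reversible stochasticity". It assumes the commonly cited NMU form, a product over inputs of `w_i * x_i + 1 - w_i`, and assumes that the stochastic variant scales each input by sampled noise and then divides the same noise back out through the unit. The function names `nmu` and `snmu`, the uniform noise distribution, and the range `[1, 5]` are illustrative assumptions, not details confirmed by the text above.

```python
import random


def nmu(x, w):
    """One NMU output: product over inputs of (w_i * x_i + 1 - w_i).

    With w_i = 1 the input is multiplied in; with w_i = 0 it is ignored.
    """
    out = 1.0
    for xi, wi in zip(x, w):
        out *= wi * xi + 1.0 - wi
    return out


def snmu(x, w, rng, low=1.0, high=5.0):
    """Stochastic NMU sketch (assumed form): noise in, same noise divided out.

    The noise cancels exactly only when every weight is 0 or 1, so
    non-discrete weights produce a noisy output during training.
    """
    noise = [rng.uniform(low, high) for _ in x]
    noisy_inputs = [xi * ni for xi, ni in zip(x, noise)]
    # For weights in [0, 1] and noise >= 1 the denominator is at least 1.
    return nmu(noisy_inputs, w) / nmu(noise, w)
```

Under these assumptions, a unit whose weights have converged to 0 or 1 returns the exact product regardless of the sampled noise, whereas intermediate weights give a noisy output. That is one way to read the abstract's claim that stochasticity discourages undesirable optima.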
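Evidence that visual communication preceded written language, and provided a basis for it, goes back to prehistory, in forms such as cave and rock paintings depicting traces of our distant ancestors. Emergent communication research seeks to explore how agents can learn to communicate in order to solve tasks collaboratively. Existing research has focused on language, with a learned communication channel transmitting sequences of discrete tokens between the agents. In this work, we explore a visual communication channel between agents that are allowed to draw with simple strokes. Our agents are parameterised by deep neural networks, and the drawing procedure is differentiable, allowing for end-to-end training. In the framework of a referential communication game, we demonstrate that agents can not only successfully learn to communicate by drawing, but, with appropriate inductive biases, can do so in a fashion that humans can interpret. We hope to encourage future research to consider visual communication as a more flexible and directly interpretable alternative for training collaborative agents.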
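Neural Arithmetic Logic Modules (NALMs) have become a growing area of interest, though they remain a niche field. These modules are neural networks that aim to achieve systematic generalisation in learning arithmetic and/or logic operations such as $\{+, -, \times, \div, \leq, \textrm{AND}\}$ while also being interpretable. This paper is the first to discuss the current state of progress in this field, explaining key works starting with the Neural Arithmetic Logic Unit (NALU). Focusing on the shortcomings of the NALU, we provide an in-depth analysis to reason about the design choices of recent modules. A cross-comparison between modules is made on experimental setups and findings, where we highlight inconsistencies in a fundamental experiment that make direct comparison across papers impossible. To alleviate the existing inconsistencies, we create a benchmark that compares all existing arithmetic NALMs. We finish by providing a novel discussion of existing applications of the NALU and of research directions requiring further exploration.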
Recent work has reported that AI classifiers trained on audio recordings can accurately predict severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection status. Here, we undertake a large-scale study of audio-based deep learning classifiers, as part of the UK government's pandemic response. We collect and analyse a dataset of audio recordings from 67,842 individuals with linked metadata, including reverse transcription polymerase chain reaction (PCR) test outcomes, of whom 23,514 tested positive for SARS-CoV-2. Subjects were recruited via the UK government's National Health Service Test-and-Trace programme and the REal-time Assessment of Community Transmission (REACT) randomised surveillance survey. In an unadjusted analysis of our dataset, AI classifiers predict SARS-CoV-2 infection status with high accuracy (Receiver Operating Characteristic Area Under the Curve (ROC-AUC) 0.846 [0.838, 0.854]), consistent with the findings of previous studies. However, after matching on measured confounders, such as age, gender, and self-reported symptoms, our classifiers' performance is much weaker (ROC-AUC 0.619 [0.594, 0.644]). Upon quantifying the utility of audio-based classifiers in practical settings, we find them to be outperformed by simple predictive scores based on user-reported symptoms.
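The gap between the unadjusted and matched ROC-AUC values can be illustrated with a small sketch. The code below computes ROC-AUC as the fraction of positive/negative pairs that the score ranks correctly, then repeats the calculation using only pairs that share the same value of a confounder. This within-stratum comparison is a simplified stand-in for matching, not the study's actual procedure. The function names and the eight-record toy dataset are synthetic and purely illustrative.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos) * len(neg))


def stratified_roc_auc(labels, scores, strata):
    """Same statistic, restricted to pairs that share a confounder value."""
    wins, pairs = 0.0, 0
    for level in set(strata):
        idx = [i for i, s in enumerate(strata) if s == level]
        pos = [scores[i] for i in idx if labels[i] == 1]
        neg = [scores[i] for i in idx if labels[i] == 0]
        for p in pos:
            for n in neg:
                wins += 1.0 if p > n else 0.5 if p == n else 0.0
                pairs += 1
    return wins / pairs


# Synthetic toy data: the score depends only on symptoms, not on infection.
infected    = [1, 1, 1, 1, 0, 0, 0, 0]
symptomatic = [1, 1, 1, 0, 1, 0, 0, 0]
score       = [0.9 if s else 0.1 for s in symptomatic]

unadjusted = roc_auc(infected, score)                        # 0.75
matched = stratified_roc_auc(infected, score, symptomatic)   # 0.5
```

In this toy example the classifier appears informative only because symptoms are more common among infected subjects. Once comparisons are restricted to subjects with the same symptom status, the score carries no information.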
Since early in the coronavirus disease 2019 (COVID-19) pandemic, there has been interest in using artificial intelligence methods to predict COVID-19 infection status based on vocal audio signals, for example cough recordings. However, existing studies have limitations in terms of data collection and of the assessment of the performances of the proposed predictive models. This paper rigorously assesses state-of-the-art machine learning techniques used to predict COVID-19 infection status based on vocal audio signals, using a dataset collected by the UK Health Security Agency. This dataset includes acoustic recordings and extensive study participant meta-data. We provide guidelines on testing the performance of methods to classify COVID-19 infection status based on acoustic features and we discuss how these can be extended more generally to the development and assessment of predictive methods based on public health datasets.
The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the 'Speak up to help beat coronavirus' digital survey alongside demographic, self-reported symptom and respiratory condition data, and linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,794 of 72,999 participants and 24,155 of 25,776 positive cases. Respiratory symptoms were reported by 45.62% of participants. This dataset has additional potential uses for bioacoustics research, with 11.30% of participants reporting asthma, and 27.20% having linked influenza PCR test results.
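Multiple existing benchmarks involve tracking and segmenting objects in video, e.g., Video Object Segmentation (VOS) and Multi-Object Tracking and Segmentation (MOTS), but there is little interaction between them because they use disparate benchmark datasets and metrics (e.g., J&F, mAP, sMOTSA). As a result, published works usually target a particular benchmark and are not easily comparable to one another. We believe that the development of generalized methods that can tackle multiple tasks requires greater cohesion among these research sub-communities. In this paper, we aim to facilitate this by proposing BURST, a dataset containing thousands of videos with high-quality object masks, and an associated benchmark with six tasks involving object tracking and segmentation in video. All tasks are evaluated using the same data and comparable metrics, which enables researchers to consider them in unison and hence pool knowledge more effectively from different methods across different tasks. Additionally, we demonstrate several baselines for all tasks and show that approaches from one task can be applied to another with a quantifiable and explainable performance difference. Dataset annotations and evaluation code are available at: https://github.com/ali2500/burst-benchmark.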
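Fall detection and classification have become an imperative problem for healthcare applications, particularly as the population continues to age. Currently, most fall classification algorithms provide a binary fall or no-fall classification. For better healthcare, binary fall classification is not enough and should be extended to the classification of multiple fall events. In this work, we utilize privacy-mitigating human skeleton data for multiple fall event classification. The skeleton features are extracted from the original RGB images, which not only mitigates personal privacy concerns but also reduces the impact of dynamic illumination. The proposed fall event classification method is divided into two stages. In the first stage, the model is trained to perform binary classification in order to filter out no-fall events. Then, in the second stage, a deep neural network (DNN) model is trained to further classify the five types of fall events. To confirm the efficiency of the proposed method, experiments on the UP-Fall dataset show that it outperforms the state of the art.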
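The two-stage scheme described above can be sketched as a simple cascade: a binary gate filters out no-fall events, and only the remaining samples reach the multi-class model. The code below is a minimal illustration under stated assumptions. The function `two_stage_classify`, the placeholder labels `fall_type_0` to `fall_type_4`, and the toy stand-in models are hypothetical and do not reproduce the paper's networks or class names.

```python
from typing import Callable, Sequence

NO_FALL = "no_fall"


def two_stage_classify(
    skeleton: Sequence[float],
    is_fall: Callable[[Sequence[float]], bool],
    fall_type: Callable[[Sequence[float]], str],
) -> str:
    """Stage 1 filters out no-fall events; stage 2 names the fall type."""
    if not is_fall(skeleton):
        return NO_FALL
    return fall_type(skeleton)


# Toy stand-ins for the two trained models (hypothetical rules, for demo only).
def toy_is_fall(skeleton: Sequence[float]) -> bool:
    # Pretend a low minimum feature value indicates a fall.
    return min(skeleton) < 0.5


def toy_fall_type(skeleton: Sequence[float]) -> str:
    # Pretend the position of the largest feature selects one of five classes.
    largest = max(range(len(skeleton)), key=lambda i: skeleton[i])
    return f"fall_type_{largest % 5}"


standing = two_stage_classify([0.9, 0.8, 0.7], toy_is_fall, toy_fall_type)
falling = two_stage_classify([0.2, 0.9, 0.4], toy_is_fall, toy_fall_type)
```

One consequence of this design is that the second-stage model only needs to be trained on fall samples, so the usually dominant no-fall class does not skew its class balance.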
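Unexpected events may occur in the operation of multi-agent teams, which range from homogeneous robot swarms to heterogeneous human-autonomy teams. While operational efficiency in multi-agent task allocation problems is the primary objective, the decision-making framework must be intelligent enough to manage unexpected task load with limited resources. Otherwise, operational efficiency will drop drastically, and overloaded agents will face unforeseen risks. In this work, we present a decision-making framework for multi-agent teams that learns task allocation with consideration of load management through decentralized reinforcement learning, and that avoids unnecessary resource usage. We illustrate the effect of load management on team performance and explore agent behaviors in example scenarios. Furthermore, we develop a measure of agent importance in collaboration for handling potential overload situations.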
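Transformers have become prevalent in computer vision due to their performance and flexibility in modelling complex operations. Of particular significance is the "cross-attention" operation, which allows a vector representation (e.g., of an object in an image) to be learned by attending to an arbitrarily sized set of input features. Recently, "masked attention" was proposed, in which a given object representation attends only to those image pixel features for which the segmentation mask of that object is active. This specialization of attention proved beneficial for various image and video segmentation tasks. In this paper, we propose another specialization of attention that enables attending over "soft masks" (those with continuous mask probabilities instead of binary values) and that is also differentiable through these mask probabilities, thus allowing the mask used for attention to be learned within the network without requiring direct loss supervision. This can be useful for several applications. Specifically, we employ our "Differentiable Soft-Masked Attention" for the task of weakly supervised Video Object Segmentation (VOS), where we develop a transformer-based network for VOS that requires only a single annotated image frame for training, but that can also benefit from cycle-consistency training on a video with just one annotated frame. Although there is no loss on the masks in unlabeled frames, the network is still able to segment objects in those frames thanks to our novel attention formulation. Code: https://github.com/ali2500/hodor/blob/main/main/hodor/modelling/encoder/soft_masked_attention.py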
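To make the idea of attending over soft masks concrete, here is a minimal single-query sketch. It assumes one plausible formulation, in which the logarithm of each mask probability is added to the attention logit, so that a probability of 1 leaves the logit unchanged and a probability near 0 suppresses it. This is an illustrative assumption and may differ from the formulation in the linked code. The function name `soft_masked_attention` and the `eps` parameter are hypothetical.

```python
import math


def soft_masked_attention(query, keys, values, mask_probs, eps=1e-6):
    """Single-query attention with logits biased by log mask probabilities.

    Assumed formulation: logit_i = (q . k_i) / sqrt(d) + log(p_i + eps).
    Binary masked attention is the special case where each p_i is 0 or 1.
    """
    d = len(query)
    logits = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) + math.log(p + eps)
        for key, p in zip(keys, mask_probs)
    ]
    # Numerically stable softmax over the biased logits.
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return output, weights


query = [1.0, 0.0]
keys = [[1.0, 0.0], [1.0, 0.0]]
values = [[1.0], [3.0]]

# Both pixels fully inside the mask: ordinary attention, equal weights.
out_full, w_full = soft_masked_attention(query, keys, values, [1.0, 1.0])
# Second pixel masked out: almost all weight moves to the first pixel.
out_masked, w_masked = soft_masked_attention(query, keys, values, [1.0, 0.0])
```

Because the mask probability enters the logit through a smooth function, an automatic differentiation framework could propagate gradients back to the mask itself. That property is what the abstract relies on to learn masks without direct supervision.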